FRED B. HOLT

Abstract. We study the gaps between consecutive prime numbers di- 
rectly through Eratosthenes sieve. Using elementary methods, we iden- 
tify a recursive relation for these gaps and for specific sequences of con- 
secutive gaps, known as constellations. Using this recursion we can 
estimate the numbers of a gap or of a constellation that occur between 
a prime and its square. This recursion also has explicit implications 
for open questions about gaps between prime numbers, including three 
questions posed by Erdős and Turán.

1. Introduction

We work with the prime numbers in ascending order, denoting the $k$th
prime by $p_k$. Accompanying the sequence of primes is the sequence of gaps
between consecutive primes. We denote the gap between $p_k$ and $p_{k+1}$ by
$g_k = p_{k+1} - p_k$. These sequences begin

$$p_1 = 2,\ p_2 = 3,\ p_3 = 5,\ p_4 = 7,\ p_5 = 11,\ p_6 = 13, \ldots$$
$$g_1 = 1,\ g_2 = 2,\ g_3 = 2,\ g_4 = 4,\ g_5 = 2,\ g_6 = 4, \ldots$$
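The opening terms of both sequences can be reproduced with a few lines of code. The following is a minimal sketch in Python (our choice of language for illustration; the helper names `primes_up_to` and `prime_gaps` are hypothetical and not part of the paper).

```python
def primes_up_to(n):
    """Return all primes <= n using Eratosthenes' sieve."""
    is_prime = [True] * (n + 1)
    is_prime[0:2] = [False, False]
    for p in range(2, int(n ** 0.5) + 1):
        if is_prime[p]:
            # Cross out the multiples of p, starting at p^2.
            for m in range(p * p, n + 1, p):
                is_prime[m] = False
    return [i for i, flag in enumerate(is_prime) if flag]


def prime_gaps(primes):
    """Gaps g_k = p_{k+1} - p_k between consecutive primes."""
    return [q - p for p, q in zip(primes, primes[1:])]


primes = primes_up_to(20)
print(primes[:6])              # [2, 3, 5, 7, 11, 13]
print(prime_gaps(primes)[:6])  # [1, 2, 2, 4, 2, 4]
```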

A number $d$ is a difference between prime numbers if there are two
prime numbers, $p$ and $q$, such that $q - p = d$. There are already many
interesting results and open questions about differences between prime
numbers; a seminal and inspirational work about differences between primes
is Hardy and Littlewood's 1923 paper [10].

A number $g$ is a gap between prime numbers if it is the difference between
consecutive primes; that is, $p = p_i$ and $q = p_{i+1}$ and $q - p = g$.
Differences of length 2 or 4 are also gaps; so open questions like the Twin
Prime Conjecture, that there are an infinite number of gaps $g_k = 2$, can
be formulated as questions about differences as well.

A constellation among primes [22] is a sequence of consecutive gaps
between prime numbers. Let $s = a_1 a_2 \cdots a_k$ be a sequence of $k$
numbers. Then $s$ is a constellation among primes if there exists a sequence
of $k + 1$ consecutive prime numbers $p_i p_{i+1} \cdots p_{i+k}$ such that
for each $j = 1, \ldots, k$, we have the gap $p_{i+j} - p_{i+j-1} = a_j$.
Equivalently, $s$ is a constellation if for some $i$ and all
$j = 1, \ldots, k$, $a_j = g_{i+j-1}$.

We will write the constellations without marking a separation between
single-digit gaps. For example, a constellation of 24 denotes a gap of
$g_k = 2$ followed immediately by a gap $g_{k+1} = 4$. The number of gaps
after $k$ iterations of the sieve is $\Phi_k = \prod_{i=1}^{k} (p_i - 1)$.
For the small primes we will consider explicitly, most of these gaps are
single digits, and the separators introduce a lot of visual clutter. We use
commas only to separate double-digit gaps in the cycle. For example, a
constellation of 2, 10, 2 denotes a gap of 2 followed by a gap of 10,
followed by another gap of 2.
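To make the definition concrete, one can search the sequence of prime gaps for the first occurrence of a given constellation. The sketch below is illustrative only; the function name `first_constellation` and the search bound `limit` are assumptions of ours.

```python
def first_constellation(s, limit=1000):
    """Smallest prime that starts the gap sequence s among primes <= limit,
    or None if s does not occur in that range."""
    sieve = [True] * (limit + 1)
    sieve[0:2] = [False, False]
    for p in range(2, int(limit ** 0.5) + 1):
        if sieve[p]:
            for m in range(p * p, limit + 1, p):
                sieve[m] = False
    primes = [i for i, flag in enumerate(sieve) if flag]
    gaps = [q - p for p, q in zip(primes, primes[1:])]
    # Slide a window of len(s) consecutive gaps along the gap sequence.
    for i in range(len(gaps) - len(s) + 1):
        if gaps[i:i + len(s)] == list(s):
            return primes[i]
    return None


print(first_constellation([2, 4]))      # 5   (primes 5, 7, 11)
print(first_constellation([2, 10, 2]))  # 137 (primes 137, 139, 149, 151)
```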

We use elementary methods to study the gaps generated by Eratosthenes 
sieve directly. By studying this sieve, we can estimate the occurrence of 
certain gaps and constellations between $p_k$ and $p_k^2$.

From the methods developed below, we can calculate exactly how many 
times a sequence s of gaps occurs after k stages of Eratosthenes' sieve. We 
don't know how many of these occurrences will survive subsequent stages 
of the sieve to become constellations among prime numbers. However, for 
a prime $p$ we can make estimates for the number that occur before $p^2$, all
of which will survive as constellations among primes. Thus our estimates 
and counts are only coincidentally commensurate with tabulations against 
powers of ten. 

The product of the first $k$ primes will be denoted by
$\Pi_k = \prod_{i=1}^{k} p_i$.

By the $p_k$-sieve, we mean those positive integers remaining after removing
all the multiples of the first $k$ prime numbers. The $p_k$-sieve has a
fundamental cycle of $\Phi_k$ elements modulo $\Pi_k$. Most often we picture
this fundamental cycle as the generators for $\mathbb{Z} \bmod \Pi_k$,
although it is also attractive to visualize these as the primitive
$\Pi_k^{\text{th}}$ roots of unity in $\mathbb{C}$.
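The fundamental cycle can be computed directly from this description, by listing the integers coprime to $\Pi_k$ and taking consecutive differences. This is a brute-force sketch for small $k$ only; the name `cycle_of_gaps` is a hypothetical helper, not notation from the paper.

```python
from math import gcd, prod


def cycle_of_gaps(primes):
    """Gaps between consecutive integers coprime to the product of `primes`,
    over one fundamental cycle starting at 1."""
    primorial = prod(primes)
    # Survivors of the sieve in [1, primorial], plus primorial + 1 to close the cycle.
    survivors = [r for r in range(1, primorial + 2) if gcd(r, primorial) == 1]
    return [b - a for a, b in zip(survivors, survivors[1:])]


g7 = cycle_of_gaps([2, 3, 5, 7])
print(cycle_of_gaps([2, 3]))     # [4, 2]
print(cycle_of_gaps([2, 3, 5]))  # [6, 4, 2, 4, 2, 4, 6, 2]
print(len(g7), sum(g7))          # 48 210
```

The last line checks the two counts stated in the text for $p_k = 7$: the cycle has $\Phi_k = 1 \cdot 2 \cdot 4 \cdot 6 = 48$ gaps, and they sum to $\Pi_k = 210$.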

1.1. Organization of the material. We proceed as follows. We identify
a recursive algorithm for producing each cycle of gaps $\mathcal{G}(p_{k+1})$
from the preceding cycle $\mathcal{G}(p_k)$. This recursion enables us to
enumerate various gaps and constellations in the $p_k$-sieve. In the cycle
of gaps $\mathcal{G}(p_k)$, of course, all the gaps from $p_{k+1}$ to
$p_{k+1}^2$ are actually gaps between prime numbers.

We make a conjecture about the uniformity of the distribution of these
gaps and constellations. From this conjecture we can make statistical
estimates about the expected number of occurrences of these gaps and
constellations below $p_{k+1}^2$, and we compare these estimates with
actual counts.

We make a weaker conjecture that every constellation in $\mathcal{G}(p_k)$
occurs infinitely often as a constellation among primes, provided the sum of
the gaps in the constellation is less than $2p_{k+1}$. From this weaker
conjecture we address several questions about gaps and differences between
prime numbers. We show that Hardy and Littlewood's $k$-tuple conjecture on
the differences between prime numbers [10] is equivalent to a conjecture on
gaps. We are also able to give exact answers to three questions posed by
Erdős and Turán [6].


1.2. New results. This paper applies elementary methods to Eratosthenes 
sieve. In a general sense, this is well-trodden ground. However, the specific 
insight of identifying the recursion on gaps appears to be new. We cast 
Eratosthenes sieve as a recursive operation directly on the cycle of gaps. 
By studying this recursion we can enumerate particular gaps at every stage. 
Moreover we observe how much structure of the cycle of gaps at one stage of 
the sieve is preserved in subsequent stages. We can thereby easily enumerate 
the occurrences of specific constellations of primes that have not previously 
been approachable (e.g. 2,10,2). 

The conclusions on the recursion of gaps are precise. To go further, we 
need to supplement our rigorous work with an appropriate conjecture. We 
first make a strong conjecture, that under the recursion the copies of a spe- 
cific constellation eventually approach a uniform distribution in the cycle. 
This conjecture on a uniform distribution allows us to make estimates of the 
occurrences of constellations in the sieve as constellations among prime num- 
bers. These new estimates compare favorably with existing estimates, and 
they allow us to estimate the occurrences of other interesting constellations 
that have lain beyond the reach of existing techniques. 

Backing off from the strong conjecture on uniformity, we make a weaker 
conjecture, that sufficiently small constellations in the sieve occur infinitely 
often as constellations among prime numbers. This conjecture implies that 
the Twin Prime Conjecture is true. However, it goes further. By identifying 
specific constellations and using the action of the recursion, we answer three 
questions posed by Erdős and Turán [6]:

i) Spikes. $\limsup g_n / g_{n+1} = \infty$ and $\liminf g_n / g_{n+1} = 0$.

ii) Oscillation. There is no $n_0$ such that for all $k > 1$,
$g_{n_0+2k-1} < g_{n_0+2k}$ and $g_{n_0+2k} > g_{n_0+2k+1}$.

iii) Superlinearity. $g_j < g_{j+1} < \cdots < g_{j+k}$ does have infinitely
many solutions for every $k$.

The progress exhibited in this paper is the result of new elementary 
insights into Eratosthenes sieve. Specifically, we can track the cycle of gaps 
explicitly through stages of the sieve. While previous methods have jumped 
immediately to probabilistic estimates, we examine the deterministic effect
that the recursion has on subsequences in the cycle of gaps. Only after
we have exhausted the exact results on these constellations, do we turn to 
simpler probabilistic estimates. 

2. Related Results 

There are of course several avenues of research into the distribution 
of primes. Research into constellations has been motivated primarily by 
two conjectures: the twin primes conjecture, and Hardy and Littlewood's 
broader $k$-tuple conjecture [TP] |2"U].

The twin primes conjecture asserts that the gap $g = 2$ occurs infinitely
often. Work on this conjecture has included computer-based enumerations
[12] [El H] and investigations of Brun's constant [22] [TOJ [20]. Brun's
constant is the sum of the reciprocals of twin primes. This series is known
to converge, and the sharpest current estimate [20] is 1.902160577783278.
One generalization of the twin primes conjecture is a conjecture by Polignac
from 1849 [20] that for every even positive integer $N$ there are an infinite
number of gaps $g_k = N$.

Hardy and Littlewood [10, 21] formulated their prime $k$-tuples conjecture
in this form: if $b_1, \ldots, b_k$ is an admissible $k$-tuple, then there
are infinitely many $x$ such that $x + b_1, \ldots, x + b_k$ are all prime.
Admissibility in this context is a condition on residue classes modulo
smaller primes.

The work in [10] supports related conjectures estimating the numbers of
specific differences that should occur in the interval $[2, N]$ for any $N$.
For a difference $d$, their Conjecture B asserts that the number $C_d(N)$ of
prime pairs $(p, p + d)$ with $p < N$ is asymptotically

$$C_d(N) \sim 2 c_2 \frac{N}{(\ln N)^2} \prod_{q \mid d} \frac{q - 1}{q - 2},$$

in which $q$ runs over the odd primes dividing $d$. The constant $c_2$,
known as the twin prime constant [TO] [22], is given by the infinite product

$$c_2 = \prod_{q > 2} \frac{q (q - 2)}{(q - 1)^2} = 0.6601618\ldots$$
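The quoted value of the twin prime constant can be checked numerically by truncating the product. The sketch below is only an illustration; the function names and the cutoff of 100,000 are arbitrary choices of ours.

```python
def primes_up_to(n):
    """Return all primes <= n using Eratosthenes' sieve."""
    sieve = [True] * (n + 1)
    sieve[0:2] = [False, False]
    for p in range(2, int(n ** 0.5) + 1):
        if sieve[p]:
            for m in range(p * p, n + 1, p):
                sieve[m] = False
    return [i for i, flag in enumerate(sieve) if flag]


def twin_prime_constant(limit):
    """Partial product of q(q-2)/(q-1)^2 over the odd primes q <= limit."""
    c = 1.0
    for q in primes_up_to(limit):
        if q > 2:
            c *= q * (q - 2) / (q - 1) ** 2
    return c


print(round(twin_prime_constant(100_000), 5))  # 0.66016
```

Each factor is slightly less than 1, so the partial products decrease toward the limit; the truncation error at this cutoff is below $10^{-6}$.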

For an admissible $k$-tuple $b_1, \ldots, b_k$, Hardy and Littlewood [10]
conjectured the general estimate

$$C_b(N) \sim 2^k \prod_{q} \frac{q^k \, (q - \phi_q(b))}{(q - 1)^{k+1}} \int_2^N \frac{dx}{(\ln x)^{k+1}} \qquad (3)$$

in which $q$ runs over the odd primes and $\phi_q(b)$ is the number of
distinct residue classes of $0, b_1, \ldots, b_k$ modulo $q$.

Hardy and Littlewood's prime $k$-tuple conjecture addresses differences
among primes, but the primes in question need not be consecutive. However,
the $k$-tuple conjecture has an equivalent formulation as a conjecture on
constellations, which we will see in Lemma 6.1 below. For the differences
$d = 2, 4$ the estimates of Conjecture B are also estimates for the
corresponding gaps $g = 2, 4$.

Most estimates for sequences of differences, e.g. [TO], [JT], are derived 
by treating probabilities on residues as independent probabilities. In
contrast, the recursion identified below in Lemma 3.1 preserves the structure
in the cycles of gaps at each stage of Eratosthenes sieve. The occurrence of
constellations in stages of the sieve is entirely deterministic. 

Computational confirmation of these estimates has been carried out by
several researchers, notably in [3] [15]. The results in this paper and the
tables and examples previously published are not quite commensurate. The
tables and examples of [21 H2] [TBI ES] provide estimates and counts of gaps
with respect to large powers of ten. Since we work directly with Eratosthenes
sieve, our estimates are given with respect to intervals $[p, p^2]$ for
primes $p$.

Some researchers have applied their investigations of differences among 
primes to study gaps and constellations. The general surveys [20] [22] provide 
overviews of some of this work, and the estimates from the seminal paper
[10] can be used for constellations consisting of 2's and 4's, e.g. 24, 42,
242, 424, etc. Brent [2] applied the principle of inclusion and exclusion to
these estimates to obtain strong estimates for the gaps 2, 4, 6, ..., 80.
Richards [21] conjectured that the constellation 24 occurs infinitely often.
Clement [4] and Nicely [17] have addressed the constellation 242,
corresponding to prime quadruplets, pairs of twin primes separated by a gap
of 4.

There are other lines of investigation [20] [7] into the gaps between prime
numbers. One line [23] fTS] [TBI [T] has looked for the first occurrence of a
gap. Cramér [5] introduced probabilistic arguments to derive asymptotic
estimates for the sequence of gaps $g_k = O(\ln^2 p_k)$. Quite recently, Green
and Tao [9] have offered a proof that there exist arbitrarily long sequences
of primes in arithmetic progression. By working with the convex hulls of
the graphs of primes $(k, p_k)$ and of the log-primes $(k, \ln p_k)$,
Pomerance [19] established a handful of nice results about inequalities
involving the arithmetic and geometric means of prime numbers.



3. Recursion for the cycle of gaps 



The possible primes for the 3-sieve are 

(1), 5, 7, 11, 13, 17, 19, 23, 25, 29, 31, 35, 37, 41, 43, ...

We investigate the structure of these sequences of possible primes by
studying the cycle of gaps in the fundamental cycle. For example, 42 is the
cycle of gaps for the 3-sieve. We have

$$\mathcal{G}(3) = 42, \text{ with } g_{3,1} = 4 \text{ and } g_{3,2} = 2.$$

The first entry in the cycle of gaps is one less than the next prime:
$g_{k,1} = p_{k+1} - 1$. Denote the sum of the first $j$ gaps in
$\mathcal{G}(p_k)$ by $\Gamma_{k,j} = \sum_{i=1}^{j} g_{k,i}$. For the
$p_k$-sieve, the $j$th possible prime is given by $1 + \Gamma_{k,j}$. Since
we are studying the cycle of gaps in the $p_k$-sieve, we know that there are
$\Phi_k$ elements in one cycle, and the sum of the gaps in one cycle must be
$\Pi_k$:

$$\Gamma_{k,\Phi_k} = \Pi_k.$$

There is a nice recursion which produces $\mathcal{G}(p_{k+1})$ from
$\mathcal{G}(p_k)$. We concatenate $p_{k+1}$ copies of $\mathcal{G}(p_k)$,
and add together certain gaps as indicated by the entry-wise product
$p_{k+1} * \mathcal{G}(p_k)$. So the recursion consists of three steps.

Lemma 3.1. The cycle of gaps $\mathcal{G}(p_{k+1})$ is derived recursively
from $\mathcal{G}(p_k)$. Each stage in the recursion consists of the
following three steps:

R1. Determine the next prime, $p_{k+1} = g_{k,1} + 1$.

R2. Concatenate $p_{k+1}$ copies of $\mathcal{G}(p_k)$.

R3. Add together $g_{k,1} + g_{k,2}$, and record the index for the location
of this addition as $i_1 = 1$; for $n = 1, \ldots, \Phi_k - 1$, add
$g_{k,j} + g_{k,j+1}$ and let $i_{n+1} = j$ if

$$\Gamma_{k,j} - \Gamma_{k,i_n} = p_{k+1} * g_{k,n}.$$

Proof. We consider the cycle of gaps in relation to the generators of
$\mathbb{Z} \bmod \Pi_k$. Suppose the differences between consecutive
generators in $\mathbb{Z} \bmod \Pi_k$ are given by the cycle of gaps
$\mathcal{G}(p_k)$. By induction, we will show this relation holds for
$\mathcal{G}(p_{k+1})$ and $\mathbb{Z} \bmod \Pi_{k+1}$.

The next prime $p_{k+1}$ will be $1 + g_{k,1}$, since this will be the
smallest integer both greater than 1 and coprime to $\Pi_k$.

The second step of the recursion extends our list of possible primes up
to $\Pi_{k+1} + 1$, the reach of the fundamental cycle for $p_{k+1}$. For the
gaps $g_{k,j}$ we extend the indexing on $j$ to cover these concatenated
copies. These $p_{k+1}$ concatenated copies of $\mathcal{G}(p_k)$ correspond
to all the numbers from 1 to $\Pi_{k+1} + 1$ which are coprime to $\Pi_k$.
For the set of generators of $\mathbb{Z} \bmod \Pi_{k+1}$, we need only
remove the multiples of $p_{k+1}$.

The third step removes the multiples of $p_{k+1}$. Removing a possible prime
amounts to adding together the gaps on either side of this entry. The only
multiples of $p_{k+1}$ which remain in the copies of $\mathcal{G}(p_k)$ are
those multiples all of whose prime factors are greater than $p_k$. After
$p_{k+1}$ itself, the next multiple to be removed will be $p_{k+1}^2$.

The multiples we seek to remove are given by $p_{k+1}$ times the generators
of $\mathbb{Z} \bmod \Pi_k$. The consecutive differences between these will be
given by $p_{k+1} * g_{k,j}$, and the sequence $p_{k+1} * \mathcal{G}(p_k)$
suffices to cover the concatenated copies of $\mathcal{G}(p_k)$. We need not
consider any fewer nor any more multiples of $p_{k+1}$ to obtain the
generators for $\mathcal{G}(p_{k+1})$.

In the statement of R3, the index $n$ moves through the copy of
$\mathcal{G}(p_k)$ being multiplied by $p_{k+1}$, and the indices $i_n$ mark
the index $j$ at which the addition of gaps is to occur. The multiples of
$p_{k+1}$ in the $p_k$-sieve are given by $p_{k+1}$ itself and
$p_{k+1} * (1 + \Gamma_{k,j})$ for $j = 1, \ldots, \Phi_k$. The difference
between successive multiples is $p_{k+1} * g_{k,j}$. □

Figure 1. An illustration of the recursion that produces the gaps for the
next stage of Eratosthenes sieve. The cycle of gaps $\mathcal{G}(7)$ is
produced from $\mathcal{G}(5)$ by concatenating 7 copies, then adding the
gaps indicated by $7 * \mathcal{G}(5)$.



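Example: $\mathcal{G}(5)$. We start with $\mathcal{G}(3) = 42$.

R1. $p_{k+1} = 5$.

R2. Concatenate five copies of $\mathcal{G}(3)$:

4242424242.

R3. Add together the gaps after 4 and thereafter after cumulative
differences of $5 * \mathcal{G}(3) = 20, 10$:

$$\mathcal{G}(5) = 4{+}2424242{+}42 = 64242462,$$

in which each $+$ joins the two gaps on either side of it, and the two
additions are separated by a difference of 20.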

Note that the last addition wraps over the end of the cycle and 
recloses the gap after the first 4. 
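The three steps of Lemma 3.1 translate almost directly into code. The following sketch is one possible rendering, in which the function name `next_cycle` and the bookkeeping variables are our own: we walk through the concatenated copies, track the current possible prime, and close the gaps around each multiple of $p_{k+1}$, advancing the next multiple by $p_{k+1} * g_{k,n}$.

```python
def next_cycle(gaps):
    """Apply steps R1-R3 of Lemma 3.1 to the cycle of gaps G(p_k)."""
    p = gaps[0] + 1            # R1: the next prime p_{k+1}
    copies = gaps * p          # R2: concatenate p_{k+1} copies of G(p_k)
    result = []
    position = 1               # the current possible prime
    target = p                 # the next multiple of p_{k+1} to remove
    n = 0                      # index of the gap giving the step p_{k+1} * g_{k,n}
    carried = 0                # a closed gap waiting to be added to its right neighbour
    for g in copies:           # R3
        position += g
        if position == target:
            # position is a multiple of p_{k+1}: add the gaps on either side.
            carried += g
            target += p * gaps[n % len(gaps)]
            n += 1
        else:
            result.append(carried + g)
            carried = 0
    return result


g5 = next_cycle([4, 2])
g7 = next_cycle(g5)
print("".join(map(str, g5)))   # 64242462
print(len(g7), sum(g7))        # 48 210
```

Starting from `[2]`, repeated calls produce the cycles for the 3-sieve, the 5-sieve, and so on, although the length of the cycle grows as $\Phi_k$ and quickly becomes impractical to store.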

Remark 3.2. The following results are easily established for
$\mathcal{G}(p_k)$:

The first difference between additions is $p_{k+1} * (p_{k+1} - 1)$, which
removes $p_{k+1}^2$ from the list of possible primes.

The last entry in $\mathcal{G}(p)$ is always 2. This difference goes from
$-1$ to $+1$ in $\mathbb{Z} \bmod \Pi_k$.

The last difference $p_{k+1} * 2$ between additions wraps from $-p_{k+1}$ to
$p_{k+1}$ in $\mathbb{Z} \bmod \Pi_{k+1}$.

Except for the final 2, the cycle of differences is symmetric:
$g_{k,j} = g_{k,\Phi_k - j}$.

If $g_{k,j} = g_{k,j+1} = \cdots = g_{k,j+m} = g$, then $g \equiv 0 \bmod p$
for all primes $p < m + 2$.

The middle of the cycle $\mathcal{G}(p_k)$ is the sequence
$$2^j, 2^{j-1}, \ldots, 4, 2, 4, 2, 4, \ldots, 2^{j-1}, 2^j$$
in which $j$ is the smallest number such that $2^{j+1} > p_{k+1}$.
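Two of these observations, the final 2 and the symmetry of the rest of the cycle, are easy to confirm by brute force for small primes. The sketch below is illustrative; `cycle_of_gaps` and `check_remark` are hypothetical helper names.

```python
from math import gcd, prod


def cycle_of_gaps(primes):
    """Gaps between consecutive integers coprime to the product of `primes`."""
    primorial = prod(primes)
    survivors = [r for r in range(1, primorial + 2) if gcd(r, primorial) == 1]
    return [b - a for a, b in zip(survivors, survivors[1:])]


def check_remark(primes):
    """Return (last entry is 2, remaining entries are symmetric)."""
    gaps = cycle_of_gaps(primes)
    ends_in_two = gaps[-1] == 2
    body = gaps[:-1]                 # drop the final 2
    symmetric = body == body[::-1]   # g_{k,j} = g_{k,Phi_k - j}
    return ends_in_two, symmetric


print(check_remark([2, 3, 5]))            # (True, True)
print(check_remark([2, 3, 5, 7]))         # (True, True)
print(cycle_of_gaps([2, 3, 5, 7])[20:27]) # [8, 4, 2, 4, 2, 4, 8]
```

The last line prints the middle of $\mathcal{G}(7)$; here $p_{k+1} = 11$ and $j = 3$, so the middle sequence runs from $2^3$ down to 2 and back.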



Example: $\mathcal{G}(7)$. Following the steps in Lemma 3.1 we construct
$\mathcal{G}(7)$ from $\mathcal{G}(5) = 64242462$. This recursion is
illustrated in Figure 1.

R1. Identify the next prime, $p_{k+1} = g_{k,1} + 1 = 7$.

R2. Concatenate seven copies of $\mathcal{G}(5)$:

64242462 64242462 64242462 64242462 64242462 64242462 64242462

R3. Add together the gaps after the leading 6 and thereafter after
differences of $7 * \mathcal{G}(5)$:

$$7 * \mathcal{G}(5) = 42, 28, 14, 28, 14, 28, 42, 14$$

Writing $+$ between each pair of gaps to be added, with successive additions
separated by the differences 42, 28, 14, 28, 14, 28, 42, we have

$$\mathcal{G}(7) = 6{+}424246264242{+}4626424{+}2462{+}6424246{+}2642{+}4246264{+}242462642424{+}62$$
$$= 10, 242462642466264264684242486462462664246264242, 10, 2$$

Note that the final difference of 14 wraps around the end of the cycle, 
from the addition preceding the final 6 to the addition after the first 
6. 

Theorem 3.3. Each possible addition of adjacent gaps in the cycle
$\mathcal{G}(p_k)$ occurs exactly once in the recursive construction of
$\mathcal{G}(p_{k+1})$.

Proof. This is an implication of the Chinese Remainder Theorem. Each
entry in $\mathcal{G}(p_k)$ corresponds to one of the generators of
$\mathbb{Z} \bmod \Pi_k$. The first gap $g_{k,1}$ corresponds to $p_{k+1}$,
and thereafter $g_{k,j}$ corresponds to $1 + \Gamma_{k,j}$. These correspond
in turn to unique combinations of nonzero residues modulo the primes
$2, 3, \ldots, p_k$. In the $p_{k+1}$ copies of $\mathcal{G}(p_k)$, each copy
of a particular gap $g_{k,j}$ has its combination of residues augmented by a
unique residue modulo $p_{k+1}$. Exactly one of these has residue 0 mod
$p_{k+1}$, so we perform $g_{k,j} + g_{k,j+1}$ for this copy and only this
copy of $g_{k,j}$. □
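The statement of Theorem 3.3 can be checked for small cycles by recording, for each multiple of $p_{k+1}$ that is removed, the index within $\mathcal{G}(p_k)$ at which the addition takes place. In the sketch below (with the hypothetical name `closure_indices`), the theorem predicts that every index from 0 to $\Phi_k - 1$ appears exactly once.

```python
from itertools import accumulate


def closure_indices(gaps):
    """Indices (within one copy of G(p_k), 0-based) of the gap that is added
    to its right neighbour, collected over all p_{k+1} concatenated copies."""
    p = gaps[0] + 1
    phi = len(gaps)
    # Generators of Z mod Pi_k in [1, Pi_k): 1, 1 + g_1, 1 + g_1 + g_2, ...
    units = [1 + t for t in accumulate([0] + gaps[:-1])]
    multiples = {p * u for u in units}
    hits = []
    position = 1
    for i, g in enumerate(gaps * p):
        position += g
        if position in multiples:
            hits.append(i % phi)
    return sorted(hits)


print(closure_indices([4, 2]))                    # [0, 1]
print(closure_indices([6, 4, 2, 4, 2, 4, 6, 2]))  # [0, 1, 2, 3, 4, 5, 6, 7]
```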

Corollary 3.4. In $\mathcal{G}(p_{k+1})$ there are at least two entries of
$2p_k$.

Proof. In forming $\mathcal{G}(p_k)$, we concatenate $p_k$ copies of
$\mathcal{G}(p_{k-1})$. At the transition between copies we have the
subsequence $(p_k - 1)\,2\,(p_k - 1)$. In $\mathcal{G}(p_k)$ each of the two
additions takes place, so the sequences $(p_k - 1)(p_k + 1)$ and
$(p_k + 1)(p_k - 1)$ both occur. In $\mathcal{G}(p_{k+1})$ the addition in
each of these two sequences occurs in one of the $p_{k+1}$ copies. □



This corollary provides long runs of composite numbers earlier than the 
traditional elementary constructions. For example, suppose we were looking 
for runs of one thousand consecutive composite numbers. In the traditional 
approach, we would take $p_{169} = 1009$ and note that

$$\{\Pi_{169} + 2, \Pi_{169} + 3, \ldots, \Pi_{169} + 1009, \Pi_{169} + 1010\}$$

and

$$\{\Pi_{169} - 1010, \Pi_{169} - 1009, \ldots, \Pi_{169} - 3, \Pi_{169} - 2\}$$

are runs of 1009 composite numbers. Using the above corollary, we take
$p_{96} = 503$ and note that in $\{p_{97}, \ldots, \Pi_{97}\}$ there occur at
least two gaps of 1006, that is, two runs of 1005 composite numbers.
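For small primes the corollary can be confirmed by counting the entries equal to $2p_k$ in $\mathcal{G}(p_{k+1})$ directly. This brute-force sketch is only feasible for the first few primes; the helper names are hypothetical.

```python
from math import gcd, prod


def cycle_of_gaps(primes):
    """Gaps between consecutive integers coprime to the product of `primes`."""
    primorial = prod(primes)
    survivors = [r for r in range(1, primorial + 2) if gcd(r, primorial) == 1]
    return [b - a for a, b in zip(survivors, survivors[1:])]


def count_double_prime_gaps(primes):
    """Number of entries equal to 2 * p_k in G(p_{k+1}),
    where primes = [p_1, ..., p_k, p_{k+1}]."""
    p_k = primes[-2]
    return cycle_of_gaps(primes).count(2 * p_k)


for ps in ([2, 3, 5], [2, 3, 5, 7], [2, 3, 5, 7, 11]):
    # Columns: p_{k+1}, the gap 2 * p_k, and how often it occurs in G(p_{k+1}).
    print(ps[-1], 2 * ps[-2], count_double_prime_gaps(ps))
```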



4. Specific constellations in $\mathcal{G}(p)$

A constellation is a sequence of gaps. In this section, we use the recursion
of Lemma 3.1 to count the number of occurrences of a constellation in
$\mathcal{G}(p_k)$. Then in the next section we estimate how many of these
occurrences survive as constellations among primes smaller than $p_{k+1}^2$.

Whether a constellation $s$ continues to occur in $\mathcal{G}(p)$ under the
recursion depends on the number of gaps.

Lemma 4.1. Let $s$ be a constellation of $j$ gaps in $\mathcal{G}(p_k)$. If
$j < p_{k+1} - 1$, then copies of $s$ will appear in all $\mathcal{G}(p)$
with $p > p_k$.



Proof. By Lemma 3.1 any constellation is initially replicated $p_{k+1}$
times, in Step R2. But then by Theorem 3.3, each possible addition occurs
exactly once, corrupting up to $j + 1$ copies of $s$. If $j < p_{k+1} - 1$,
then at least one copy of $s$ survives intact. □

By Lemma 4.1, if a constellation is short enough, then it will propagate
through the $\mathcal{G}(p)$ until the conditions of the following theorem are
met, and from that point on we can enumerate the occurrences of the
constellation by the recursive equations provided in the theorem.

Theorem 4.2. Let $s$ be a constellation of $j$ gaps in $\mathcal{G}(p_k)$,
such that the sum of these $j$ gaps is less than $2p_{k+1}$. Let $S$ be the
set of all constellations $\tilde{s}$ which would produce $s$ upon one
addition of differences. Then the number $N_s(p)$ of occurrences of $s$ in
$\mathcal{G}(p)$ satisfies the recurrence

$$N_s(p_{k+1}) = (p_{k+1} - (j + 1)) * N_s(p_k) + \sum_{\tilde{s} \in S} N_{\tilde{s}}(p_k).$$

Proof. We account for the number of copies of s which survive the recursion 
intact, and we add to this the number of new copies of s generated from 
other sequences. 



The $j$ gaps in $s$ can be closed in $j + 1$ ways. Theorem 3.3 tells us that
each of these will occur exactly once. If the sum of the gaps in $s$ is less
than $2p_{k+1}$, then these closings are guaranteed to occur in distinct
copies of $s$. This establishes the first term on the right-hand side.

Finally, the summation on the right-hand side accounts for occurrences 
which are generated from other constellations. □ 
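The recurrence can be checked against direct counts in the step from $\mathcal{G}(5)$ to $\mathcal{G}(7)$. The sketch below counts occurrences cyclically and tests two single gaps: the gap 2, which has no driving terms, and the gap 6, which is driven by the constellations 24 and 42. The helper names `cycle_of_gaps` and `count` are ours.

```python
from math import gcd, prod


def cycle_of_gaps(primes):
    """Gaps between consecutive integers coprime to the product of `primes`."""
    primorial = prod(primes)
    survivors = [r for r in range(1, primorial + 2) if gcd(r, primorial) == 1]
    return [b - a for a, b in zip(survivors, survivors[1:])]


def count(cycle, s):
    """Occurrences of the constellation s in the cycle, allowing wrap-around."""
    j = len(s)
    wrapped = cycle + cycle[:j - 1]
    return sum(wrapped[i:i + j] == s for i in range(len(cycle)))


g5 = cycle_of_gaps([2, 3, 5])
g7 = cycle_of_gaps([2, 3, 5, 7])

# s = 2: one gap (j = 1), no driving terms.
lhs_2 = count(g7, [2])
rhs_2 = (7 - 2) * count(g5, [2])

# s = 6: one gap (j = 1), driven by the constellations 24 and 42.
lhs_6 = count(g7, [6])
rhs_6 = (7 - 2) * count(g5, [6]) + count(g5, [2, 4]) + count(g5, [4, 2])

print(lhs_2, rhs_2)  # 15 15
print(lhs_6, rhs_6)  # 14 14
```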

Corollary 4.3. Let $s$ be a constellation of $j$ gaps in $\mathcal{G}(p_k)$.
If $j < p_{k+1} - 1$, then the number of copies of $s$, $N_s(p)$, will be
dominated by the factors $(p - j - 1)$. Thus all constellations of $j$ gaps
grow asymptotically at the same rate.



Although the asymptotic growth rates of all constellations of $j$ gaps are
equal, the initial conditions and driving terms are important. Brent [2] made
analogous observations for single gaps ($j = 1$). His Table 2 indicates the
importance of the lower-order effects in estimating relative occurrences of
certain gaps.

Figure 2 provides a diagram of growths among various constellations.
For each constellation $s$, its growth is dominated by the number $j$ of gaps
through the factors $(p - j - 1)$. Those constellations in $S$ which produce
$s$ after one addition provide the driving terms in the summation, but they
are added in without a multiplier, and they themselves grow with factors
$(p - j - 2)$, since they are one gap longer than $s$. Finally, the counts
$N_s(p)$ for various constellations will vary by their initial conditions.
These initial conditions are the prime $p$ for which $s$ first occurs in
$\mathcal{G}(p)$ and the number of these first occurrences.

Neither 2's nor 4's are generated from any other sequence. In
$\mathcal{G}(3)$ there is one 2 and one 4, so the numbers of occurrences of
these two gaps will continue to be equal through all stages of the sieve.

From Figure 2 or from the symmetry of $\mathcal{G}(p)$, we know that the
constellations 24 and 42 will always occur equally often. However, 242 has no
driving terms and so it will soon be outnumbered by the constellation 2,10,2.
The constellation 242 also generates occurrences of the constellations 26 and
62, and these in turn generate 8's; a gap of 6 is generated by the
constellations 24 and 42, which themselves have no generators. Under the
recursion we will eventually have

$$N_2(p) = N_4(p) < N_6(p) < N_8(p).$$

Figure 2. The numbers $N_s(p)$ of a constellation $s$ are completely
described by figures like this. All constellations of $j$ gaps have identical
dominant terms in Theorem 4.2, so we line up our entries for constellations
in columns indexed by $j$. For each constellation $s$ we include its initial
conditions: the first prime $p$ for which $s$ occurs in $\mathcal{G}(p)$ and
the conditions of the theorem hold; and the number of occurrences of $s$ in
$\mathcal{G}(p)$. We indicate driving terms for the recurrence in Theorem 4.2
by directed arrows. Despite any disparities in initial conditions between two
constellations of $j$ gaps, the constellation with more driving terms will
rapidly become more numerous.



More surprising constellations include 42424 or 2,10,2 or even
2,10,2,10,2. Due to the terms for the sequences 2462 and 2642, the
constellation 2,10,2 becomes more abundant than the constellation 242. These
two equal additional terms in the recursion for 2,10,2 fade in their
significance compared to the multiplier $(p - 4)$. By $p = 89$, these two
terms are contributing around one-half a percent of the total. Similarly,
while the constellations 42424 and 2,10,2,10,2 both occur, ultimately
2,10,2,10,2 will be more abundant due to driving terms.

In the calculation for $N_{2,10,2}(p)$, we can also observe the necessity of
the requirement that the sum of the elements in a constellation be less than
$2p$. One initial condition for $N_{2,10,2}(p)$ is $N_{2462}(7) = 3$. Since
there are four elements in the constellation 2462, we expect the number of
these to grow as $(p - 5)$, and without checking, we might assume that there
would be 2 such constellations in $\mathcal{G}(7)$. However, the sum of these
elements is $14 = 2 * 7$, and in the construction of $\mathcal{G}(7)$ a sum of
$2 * 7$ falls perfectly across one copy of 2462. The result is that two of the
gaps are closed in this single copy of 2462, letting an additional one of the
seven original copies (from step R2) survive. Until $2p$ is greater than the
sum of the elements in the constellation, we cannot be certain that each of
the gaps in this constellation will be closed in distinct copies.
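The count for 2462 discussed above can be reproduced by brute force. In this sketch (helper names are ours), the single copy of 2462 in $\mathcal{G}(5)$ and the multiplier $(7 - 5)$ would suggest two copies in $\mathcal{G}(7)$, whereas the direct count gives three.

```python
from math import gcd, prod


def cycle_of_gaps(primes):
    """Gaps between consecutive integers coprime to the product of `primes`."""
    primorial = prod(primes)
    survivors = [r for r in range(1, primorial + 2) if gcd(r, primorial) == 1]
    return [b - a for a, b in zip(survivors, survivors[1:])]


def count(cycle, s):
    """Occurrences of the constellation s in the cycle, allowing wrap-around."""
    j = len(s)
    wrapped = cycle + cycle[:j - 1]
    return sum(wrapped[i:i + j] == s for i in range(len(cycle)))


s = [2, 4, 6, 2]
n_5 = count(cycle_of_gaps([2, 3, 5]), s)     # copies in G(5)
n_7 = count(cycle_of_gaps([2, 3, 5, 7]), s)  # copies in G(7)
naive = (7 - (len(s) + 1)) * n_5             # multiplier (p - j - 1) alone

print(n_5, naive, n_7)  # 1 2 3
```

The discrepancy arises because the sum of the gaps in 2462 is 14, which is not less than $2 \cdot 7$, so the hypothesis of Theorem 4.2 fails at this stage.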

From the work above, we have methods for tracking the exact number of
copies of any constellation through the stages of Eratosthenes sieve. Unless
the constellation occurs in $\mathcal{G}(5)$, it must initially result from
the closing of gaps in longer constellations. When a constellation appears in
some $\mathcal{G}(p)$, Lemma 4.1 and Theorem 4.2 provide the conditions and
formulae for the exact number of this constellation that will appear in each
stage of the sieve. This system is illustrated in Figure 2.



5. Expected Constellations of Prime Numbers 

All of the work above is deterministic. We have identified a recursion that
produces the cycle $\mathcal{G}(p_k)$ of gaps between consecutive generators
of $\mathbb{Z} \bmod \Pi_k$. We are able to identify and count individual gaps
and constellations as the recursion progresses. In this section we introduce
two conjectures, to help us explore the potential of this approach.

The quantities $N_s(p)$ provide exact counts of the constellation $s$ in the
cycles $\mathcal{G}(p)$ as we iterate on $p$. These counts indicate how often
$s$ occurs as a constellation among possible primes. We would like to use
this information to draw conclusions about how often $s$ occurs as a
constellation among prime numbers.

To estimate how often $s$ occurs as a constellation among prime numbers,
we combine two insights. First, all constellations in $\mathcal{G}(p_k)$ from
$p_{k+1}$ to $p_{k+1}^2$ are constellations among primes. Second, the
recursion suggests uniform distributions.

Remark 5.1. In $\mathcal{G}(p_k)$, all the gaps that occur after
$g_{k,1} = p_{k+1} - 1$ and before $p_{k+1}^2$ are actually gaps between prime
numbers. For $j > 1$, if $\Gamma_{k,j} < p_{k+1}^2$, then
$g_{k,j} = g_{k+j-1}$.

Constellations which fall in the interval $[p, p^2]$ are constellations among
primes. We want to understand the distribution of the $N_s(p)$ copies of a
constellation $s$ over the cycle from 1 to $\Pi_k$. Observe that the
concatenation step in the recursion produces uniformly distributed copies of
every constellation. By using the interval $[p, p^2]$ to sample these
approximate uniform distributions of constellations, we can make estimates
$E_s(p)$ of how often $s$ occurs as a constellation among primes.



5.1. Nearly Uniform Distributions. The recursion in Lemma 3.1 suggests that
if we track the images of some gap $g$ through several stages of Eratosthenes
sieve, these images will be almost uniformly distributed in the fundamental
cycle. In $\mathcal{G}(p_k)$ pick any gap $g = g_{k,j}$. Step R2 of the
recursion creates $p_{k+1}$ copies of this gap, uniformly distributed in the
interval $[1, \Pi_{k+1}]$. Step R3 removes two of these copies of $g$. In
$\mathcal{G}(p_{k+2})$, step R2 creates $p_{k+2}$ copies of the set of nearly
uniformly distributed $p_{k+1} - 2$ copies of $g$. These $p_{k+2}$ copies are
uniformly distributed in the interval $[1, \Pi_{k+2}]$. After step R3, the
$(p_{k+2} - 2)(p_{k+1} - 2)$ images of $g$ in $\mathcal{G}(p_{k+2})$ are
approximately uniformly distributed. As we continue applying the recursion,
the abundant images of $g$ are pushed toward uniformity by step R2 and
trimmed symmetrically by step R3. In Figure 1 we can see this effect after
only one stage of the sieve.
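One simple way to look at this distribution numerically is to split the fundamental cycle for $p_{k+1}$ into its $p_{k+1}$ concatenated blocks of length $\Pi_k$ and count the surviving copies of a gap in each block. The sketch below does this for the gap 2 in $\mathcal{G}(11)$; it is an illustration under our own choice of binning, not a test of the conjecture that follows, and the function names are hypothetical.

```python
from math import gcd, prod


def twin_positions(primes):
    """Start positions of the gaps equal to 2 in one fundamental cycle."""
    primorial = prod(primes)
    survivors = [r for r in range(1, primorial + 2) if gcd(r, primorial) == 1]
    return [a for a, b in zip(survivors, survivors[1:]) if b - a == 2]


def block_counts(primes):
    """Count the gaps of 2 in each of the p_{k+1} blocks of length Pi_k,
    where primes = [p_1, ..., p_{k+1}]."""
    block = prod(primes[:-1])
    counts = [0] * primes[-1]
    for a in twin_positions(primes):
        counts[(a - 1) // block] += 1
    return counts


counts = block_counts([2, 3, 5, 7, 11])
print(counts)
print(sum(counts))  # 135
```

Each block starts with the 15 copies of the gap 2 inherited from $\mathcal{G}(7)$, and step R3 removes 30 copies in total, leaving $11 \cdot 15 - 30 = 135$.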

The preceding observations apply to constellations as well. For a constel- 
lation, the distribution of its images may not well-approximate a uniform 
distribution until several of the recursions have been applied. If the con- 
stellation contains j gaps, then after step R2 in the recursion distributes p 
copies of s uniformly, step R3 trims j + 1 of these copies by closing gaps in 
the constellation. These j + 1 copies of s are trimmed in symmetrical fash- 
ion. Although j may initially be almost as large as p, through the recursion 
p grows while j remains fixed. Before long, the trimming of step R3 will not 
substantially disrupt the uniformity of the distribution enforced by step R2. 

These observations, about the effects of the recursion on the distribution 
of occurrences of a particular constellation, indicate a uniform distribution, 
but there is more work to be done in understanding the distribution of copies
of $s$ in $\mathcal{G}(p)$. Further investigation will proceed along two
lines: computer searches and statistical analyses. For our present purposes
we leave the desired result as a conjecture.

Conjecture 5.2. Under the recursion in Lemma 3.1, all constellations in
$\mathcal{G}(p)$ of sum less than $2p$ tend toward a uniform distribution in
$\mathcal{G}(P)$ for all primes $P \gg p$.

Our estimates of the frequency of occurrences of certain constellations 
require the uniform distribution asserted by this conjecture. However, other 
results, for example on some conjectures by Erdős and Turán [6], require a
much weaker conjecture. 



Conjecture 5.3. Under the recursion in Lemma 3.1, all constellations in
$\mathcal{G}(p_k)$ of sum less than $2p_{k+1}$ and with fewer than
$p_{k+1} - 1$ gaps occur infinitely often as constellations among larger
primes.

From Lemma 4.1 we know that copies of these constellations will survive
throughout the $\mathcal{G}(p)$, and from Theorem 4.2 that the number of
copies will grow superexponentially. Our first conjecture postulates a
distribution which enables us to count the copies which fall in the intervals
$[p, p^2]$ for every $p$.

This first conjecture suggests that the third step in the recursion has an
approximately uniform effect on the uniform distribution of copies created
by the second step. The second, weaker conjecture asserts only that for some
subsequence of the primes, each interval $[p, p^2]$ contains at least one copy
of the constellation. Suppose the third step in the recursion removes more
copies of a constellation from the ends of the cycle of gaps, so that the
superexponential number of copies are clustered in the middle (recall that
$\mathcal{G}(p)$ is symmetric). This second conjecture postulates that
occasionally a copy of the constellation falls into the interval $[p, p^2]$.

To bolster these conjectures, we can step back and look at the aggregate 
population of all constellations. Not all constellations can accumulate in 
the middle of $\mathcal{G}(p)$. Some must fall in $[p, p^2]$. If the distribution for some
particular constellation is forever biased strongly toward the middle of the 
cycle, then the distributions of some other constellations must compensate 
for this bias. That is, if some constellations fall below the expected number 
of occurrences, then other constellations must exceed the expectations. 



5.2. Estimates. For a given constellation $s$, we use the approximate 
uniformity of the $N_s(p_k)$ copies of $s$ in $\mathcal{G}(p_k)$ to estimate 
the number that survive as constellations among primes. There are 
$\phi(\Pi_k)$ gaps in $\mathcal{G}(p_k)$. The average length $\mu$ of a gap 
in $\mathcal{G}(p_k)$ is 

\[ \mu = \frac{\Pi_k}{\phi(\Pi_k)} \sim e^{\gamma} \ln p_k. \]

The limit is due to Mertens, and its derivation is recorded in [11]. In an 
interval $I$ in $[1, \Pi_k]$, we can expect there to be $|I|/\mu$ gaps. Thus 
the expected number of occurrences of $s$ in $I$ is 

\[ E(s, I) = \frac{N_s(p_k)}{\phi(\Pi_k) - j + 1} \cdot \frac{|I|}{\mu}, \]

where $j$ is the number of gaps in $s$. The first factor is the fraction of 
gaps in $\mathcal{G}(p_k)$ that start a copy of $s$, and the second factor is 
the expected number of gaps in the interval $I$. As $k$ gets large, the 
constant correction of $j - 1$ in the first denominator becomes 
inconsequential. 
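
The Mertens limit for the average gap length can be checked numerically. 
The following is a minimal sketch, assuming that the sum of the gaps in the 
cycle is the primorial $\Pi_k$ and that the number of gaps is $\phi(\Pi_k)$, 
so that $\mu = \prod_{q \le p_k} q/(q-1)$; the helper names `primes_up_to` 
and `mean_gap` are illustrative and do not come from the paper. 

```python
import math

EULER_GAMMA = 0.5772156649015329  # Euler-Mascheroni constant


def primes_up_to(n):
    """Sieve of Eratosthenes: all primes <= n (n >= 1)."""
    flags = [True] * (n + 1)
    flags[0:2] = [False, False]
    for i in range(2, math.isqrt(n) + 1):
        if flags[i]:
            for m in range(i * i, n + 1, i):
                flags[m] = False
    return [i for i, f in enumerate(flags) if f]


def mean_gap(p):
    """Average gap mu = Pi_k / phi(Pi_k), as a product of q/(q-1) over primes q <= p."""
    mu = 1.0
    for q in primes_up_to(p):
        mu *= q / (q - 1)
    return mu


if __name__ == "__main__":
    for p in (7, 101, 1009, 14939):
        asymptotic = math.exp(EULER_GAMMA) * math.log(p)
        print(p, round(mean_gap(p), 4), round(asymptotic, 4))
```

For $p = 7$ the exact value is $210/48 = 4.375$; for the larger primes the 
two columns printed by the loop should agree more and more closely. 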

Of particular interest to us is the interval $[p_{k+1}, p_{k+1}^2]$: 

\[ E_s^k = E(s, [p_{k+1}, p_{k+1}^2]) \sim (p_{k+1}^2 - p_{k+1}) \frac{N_s(p_k)}{\Pi_k}. \tag{4} \]



As a first application of this approach, we estimate the number of twin 
primes between $p$ and $p^2$: 

\[ E_2^k = (p_{k+1}^2 - p_{k+1}) \frac{N_2(p_k)}{\Pi_k} \sim 2 e^{-2\gamma} c_2 \frac{p_{k+1}^2 - p_{k+1}}{\ln^2 p_k}. \]



Our estimate for the number of twin primes must be contrasted with 
those estimates provided by Hardy and Littlewood, which are supported by 
vast computation [2] [15]: 

\[ \#\{ g_i = 2 : p_i \in [p_{k+1}, p_{k+1}^2] \} \sim 2 e^{-2\gamma} c_2 \frac{p_{k+1}^2 - p_{k+1}}{\ln^2 p_k} \]

vs. 

\[ \#\{ g_i = 2 : p_i \in [2, N] \} \sim 2 c_2 \int_2^N \frac{dx}{(\ln x)^2} \sim 2 c_2 \frac{N}{(\ln N)^2}. \]
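
To make the contrast concrete, both estimates can be evaluated on the same 
interval. The following is a sketch at leading order only, assuming 
$N = p_{k+1}^2$ in the Hardy-Littlewood form, $p_k \approx p_{k+1}$ for large 
$k$, and dropping the lower-order term $-p_{k+1}$; it is an illustration, not 
part of the original derivation. 

```latex
\[
  2 c_2 \frac{N}{(\ln N)^2} \Big|_{N = p_{k+1}^2}
    = 2 c_2 \frac{p_{k+1}^2}{(2 \ln p_{k+1})^2}
    = \frac{c_2}{2} \cdot \frac{p_{k+1}^2}{\ln^2 p_{k+1}},
  \qquad
  \frac{2 e^{-2\gamma} c_2}{c_2 / 2} = 4 e^{-2\gamma} \approx 1.26 .
\]
```

Under these assumptions the sieve-based estimate exceeds the 
Hardy-Littlewood estimate by a constant factor of about 1.26 at leading 
order, while both have the same order of growth $p^2 / \ln^2 p$. 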



Some actual counts $C_2^k$ of the single-gap constellation 2 occurring 
between $p_k$ and $p_k^2$ are tabled below. These actual counts are compared 
to the estimates from the preceding stage of the sieve, $E_2^{k-1}$, and to 
the Hardy-Littlewood estimate $HL_2^k$. The right half of the table compares 
the counts and estimates for the single-gap constellations 6 and 8. 

\begin{tabular}{r|rrr|rr|rr}
$p_k$ & $C_2^k$ & $E_2^{k-1}$ & $HL_2^k$ & $C_6^k$ & $E_6^{k-1}$ & $C_8^k$ & $E_8^{k-1}$ \\ \hline
11    & 8      & 8      & 4      & 7       & 7       & 2      & 1      \\
13    & 9      & 9      & 6      & 10      & 10      & 1      & 2      \\
101   & 202    & 181    & 152    & 296     & 286     & 104    & 96     \\
199   & 574    & 530    & 457    & 898     & 878     & 335    & 312    \\
499   & 2557   & 2470   & 2112   & 4099    & 4263    & 1672   & 1579   \\
1009  & 8278   & 8217   & 6997   & 13715   & 14521   & 5643   & 5506   \\
1999  & 26777  & 26742  & 22788  & 44785   & 48159   & 18762  & 18601  \\
2503  & 39326  & 39558  & 33717  & 66333   & 71628   & 27924  & 27811  \\
4999  & 130343 & 133426 & 113623 & 223691  & 245166  & 96283  & 96528  \\
10007 & 440666 & 457406 & 389427 & 769389  & 850965  & 334491 & 338959 \\
12503 & 653634 & 681311 & 579620 & 1146148 & 1271986 & 499702 & 508315 \\
14939 & 895790 & 936917 & 797157 & 1576337 & 1753990 & 689398 & 702709 \\
\end{tabular}
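
The first two rows of the twin-prime columns can be reproduced with a few 
lines of code. This is a minimal sketch, not the author's program. It assumes 
that the number of gaps 2 in the cycle is $N_2(p_k) = \prod_{3 \le q \le p_k} (q - 2)$ 
(a fact about the recursion taken from outside this section) and that a twin 
pair is counted when its smaller prime lies in $[p, p^2]$. The function names 
are illustrative. 

```python
import math


def primes_up_to(n):
    """Sieve of Eratosthenes: all primes <= n (n >= 1)."""
    flags = [True] * (n + 1)
    flags[0:2] = [False, False]
    for i in range(2, math.isqrt(n) + 1):
        if flags[i]:
            for m in range(i * i, n + 1, i):
                flags[m] = False
    return [i for i, f in enumerate(flags) if f]


def twin_estimate(p_prev, p_next):
    """(p_next^2 - p_next) * N_2(p_prev) / Pi, with N_2 = prod (q - 2) over odd q <= p_prev."""
    ratio = 0.5  # the prime 2 contributes a factor 1 to N_2 and a factor 2 to Pi
    for q in primes_up_to(p_prev):
        if q > 2:
            ratio *= (q - 2) / q
    return (p_next ** 2 - p_next) * ratio


def twin_count(lo, hi):
    """Number of primes q in [lo, hi] for which q + 2 is also prime."""
    primes = primes_up_to(hi + 2)
    prime_set = set(primes)
    return sum(1 for q in primes if lo <= q <= hi and q + 2 in prime_set)


if __name__ == "__main__":
    for p_prev, p_next in ((7, 11), (11, 13)):
        print(p_next, twin_count(p_next, p_next ** 2),
              round(twin_estimate(p_prev, p_next), 3))
```

For $p_k = 11$ the estimate from the preceding stage is $110 \cdot 15/210 \approx 7.86$ 
against an actual count of 8, and for $p_k = 13$ it is $156 \cdot 135/2310 \approx 9.12$ 
against a count of 9, in line with the tabulated values. 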



Twin primes occur in interesting constellations, for example the prime 
quadruplets constellation $s = 242$. This constellation occurs in 
$\mathcal{G}(5)$. With $N_{242}(5) = 1$, we calculate the number of expected 
occurrences of the constellation 242 between $p$ and $p^2$. Under the 
recursion, $N_{242}(p_{k+1}) = (p_{k+1} - 4) N_{242}(p_k)$. So, under our 
conjecture of uniformity, the expected number of these constellations 
between $p_{k+1}$ and $p_{k+1}^2$ is 

\[ E_{242}^k = (p_{k+1}^2 - p_{k+1}) \frac{\prod_{p_j = 5}^{p_k} (p_j - 4)}{\Pi_k} \sim \frac{27}{2} e^{-4\gamma} c_4 \frac{p_{k+1}^2 - p_{k+1}}{\ln^4 p_k} \tag{5} \]

with 

\[ c_4 = \prod_{p \ge 5} \frac{p^3 (p - 4)}{(p - 1)^4} = 0.30749\ldots \]

This constant $c_4$ can be derived from the estimates (1) and (3). See for 
example [22]. 

The extensive computations of [17] support the Hardy-Littlewood 
estimates [22] for $s = 242$: 

\[ \frac{27}{2} c_4 \int_2^N \frac{dx}{(\ln x)^4} \sim \frac{27}{2} c_4 \frac{N}{(\ln N)^4}. \]

Calculating $E_{2,10,2}^k$ is more involved because of the driving terms from 
the constellations 2462 and 2642. We note that for all $p > 13$, 
$N_{2,10,2}(p) > N_{242}(p)$, but that both have the same dominating factor 
of $p - 4$. 

\begin{tabular}{r|rrr|rr|rr}
$p_k$ & $C_{242}^k$ & $E_{242}^{k-1}$ & $HL_{242}^k$ & $C_{2,10,2}^k$ & $E_{2,10,2}^{k-1}$ & $C_{2,10,2,10,2}^k$ & $E_{2,10,2,10,2}^{k-1}$ \\ \hline
11    & 2    & 2     &      &       &       &     &     \\
13    & 1    & 1     &      & 1     & 1     &     &     \\
101   & 10   & 9     & 5    & 18    & 16    & 1   & 1   \\
199   & 20   & 20    & 12   & 35    & 37    & 2   & 2   \\
499   & 56   & 67    & 42   & 118   & 135   & 5   & 6   \\
1009  & 167  & 182   & 114  & 325   & 377   & 10  & 13  \\
1999  & 459  & 490   & 308  & 873   & 1041  & 25  & 29  \\
2503  & 620  & 683   & 431  & 1249  & 1464  & 38  & 39  \\
4999  & 1714 & 1948  & 1228 & 3621  & 4255  & 84  & 95  \\
10007 & 4760 & 5712  & 3604 & 10502 & 12686 & 212 & 243 \\
12503 & 6657 & 8118  & 5114 & 14872 & 18113 & 300 & 331 \\
14939 & 8777 & 10753 & 6777 & 19556 & 24079 & 378 & 424 \\
\end{tabular}
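
The constant 0.30749... and the first two rows of the 242 columns can be 
checked numerically. This is a minimal sketch under two stated assumptions: 
that the constant is the standard Hardy-Littlewood product 
$\prod_{p \ge 5} p^3 (p-4)/(p-1)^4$, truncated here at a finite limit, and 
that $N_{242}(p_k)/\Pi_k = \frac{1}{30} \prod_{5 < q \le p_k} (q-4)/q$, which 
follows from $N_{242}(5) = 1$ and the factor $p - 4$ in the recursion. The 
function names are illustrative. 

```python
import math


def primes_up_to(n):
    """Sieve of Eratosthenes: all primes <= n (n >= 1)."""
    flags = [True] * (n + 1)
    flags[0:2] = [False, False]
    for i in range(2, math.isqrt(n) + 1):
        if flags[i]:
            for m in range(i * i, n + 1, i):
                flags[m] = False
    return [i for i, f in enumerate(flags) if f]


def c4_partial(limit):
    """Partial product of p^3 (p - 4) / (p - 1)^4 over primes 5 <= p <= limit."""
    c = 1.0
    for q in primes_up_to(limit):
        if q >= 5:
            c *= q ** 3 * (q - 4) / (q - 1) ** 4
    return c


def quad_estimate(p_prev, p_next):
    """(p_next^2 - p_next) * N_242(p_prev) / Pi, starting from N_242(5) / 30 = 1/30."""
    ratio = 1.0 / 30.0
    for q in primes_up_to(p_prev):
        if q > 5:
            ratio *= (q - 4) / q
    return (p_next ** 2 - p_next) * ratio


if __name__ == "__main__":
    print(round(c4_partial(100000), 5))
    print(round(quad_estimate(7, 11), 3), round(quad_estimate(11, 13), 3))
```

With these assumptions the estimates for $p_k = 11$ and $p_k = 13$ are 
$110 \cdot 3/210 \approx 1.57$ and $156 \cdot 21/2310 \approx 1.42$, which 
round to the tabulated values 2 and 1. 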



The tables above sample the output from a program that searches for 
prime constellations, up to the limits of long integer arithmetic and available 
RAM. The largest prime for which constellations can be counted accurately 
by this program is 14939. Extending these computational results is one open 
avenue of research. 

Correlations among copies of a constellation are somewhat preserved by 
the recursion. Our Conjecture 5.2 does not take into account these 
correlations. For example, the constellation 2,10,2,10,2 occurs in 
$\mathcal{G}(7)$ and thereafter. This constellation contains two occurrences 
of 2,10,2 and will consequently introduce jumps in $C_{2,10,2}^k$. Any 
deviations from a uniform distribution are altered during the recursions. 
These changes to the distribution are therefore occurring on the scale of 
$\Pi_k$. As a reference point for this scale, the results tabulated in this 
paper, through $p = 14939$, require only $\Pi_9$, with $p_9 = 23$. 
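
The overlap between 2,10,2,10,2 and 2,10,2 can be seen directly in the cycle 
of gaps. The following is a minimal sketch, assuming the cycle for a prime 
$p$ is the list of gaps between consecutive integers coprime to the 
primorial, read once around starting from 1; `gap_cycle` and `count_cyclic` 
are illustrative names. 

```python
from math import gcd


def gap_cycle(primes):
    """Gaps between consecutive integers coprime to the product of `primes`,
    starting at 1 and ending at 1 + product (one full cycle)."""
    modulus = 1
    for q in primes:
        modulus *= q
    units = [x for x in range(1, modulus + 2) if gcd(x, modulus) == 1]
    return [b - a for a, b in zip(units, units[1:])]


def count_cyclic(cycle, s):
    """Number of starting positions at which s occurs, wrapping around the cycle."""
    n = len(cycle)
    return sum(
        1
        for i in range(n)
        if all(cycle[(i + t) % n] == g for t, g in enumerate(s))
    )


if __name__ == "__main__":
    g7 = gap_cycle([2, 3, 5, 7])
    print(len(g7), count_cyclic(g7, [2, 10, 2, 10, 2]), count_cyclic(g7, [2, 10, 2]))
```

In this sketch the cycle for 7 has 48 gaps, contains one copy of 2,10,2,10,2 
(wrapping around the end of the cycle), and both of its copies of 2,10,2 lie 
inside that single copy. 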



5.3. Refinements. Based on our conjecture about uniformity, we could 
calculate the expected number of a constellation $s$ that fall between 
$p_{k+1}$ and $p_{k+1}^2$ as 

\[ E = (p_{k+1}^2 - p_{k+1}) \frac{N_s(p_k)}{\Pi_k}. \]

For small primes we can improve these estimates. The symmetry of 
$\mathcal{G}(p)$ allows us to refine the denominator to $\Pi_k - 2p_{k+1}$. 
Observations about the middle of $\mathcal{G}(p)$ allow a further refinement 
in the denominator. Our estimates do not need to use the entire interval 
$[1, \Pi_k]$. The cycle $\mathcal{G}(p)$ is symmetric, and its middle 
constellation is $2^j \ldots 42424 \ldots 2^j$. Thus we could work over the 
interval $[p_k, \Pi_k/2 - 2^{j+1}]$ instead; the term $2^{j+1}$ is the 
smallest power of 2 greater than $p_{k+1}$. 

These refinements may require us to decrement the numerator slightly, 
adjusting for occurrences we know we've excluded, e.g. the final 2. We could 
adjust the interval $[p, p^2]$ to $[p, p^2 - l]$ to adjust for the sum $l$ of 
the gaps in the constellation. These refinements are constant or of order 
$p$, which become inconsequential rapidly in the face of factors like 
$\Pi_k$. 

6. Implications for other problems on constellations 

Although Hardy and Littlewood's prime $k$-tuple conjecture is formulated 
about differences between primes, we show here that the conjecture has an 
equivalent formulation as a conjecture on constellations. 

To the $k$-tuple we associate a constellation. Without loss of generality 
we can assume the $k$-tuple is in ascending order. From this $k$-tuple we can 
derive a sequence of $k - 1$ differences: 

\[ s_b = (b_2 - b_1), (b_3 - b_2), \ldots, (b_k - b_{k-1}). \tag{6} \]

This sequence is the constellation we want to associate with the $k$-tuple. 
Notice, however, that the $k$-tuple conjecture does not require that the 
primes be consecutive. So while the $k$-tuple conjecture can be trivially 
reformulated for a sequence of differences, we have a little work to do to 
establish that this is equivalent to a conjecture on constellations. 
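
The association of a $k$-tuple with its sequence of differences can be 
written out in a few lines. This is a minimal sketch; the function name 
`constellation_of` is illustrative and does not come from the paper. 

```python
def constellation_of(b):
    """Sort the k-tuple and return its k - 1 consecutive differences."""
    ordered = sorted(b)
    return [y - x for x, y in zip(ordered, ordered[1:])]


if __name__ == "__main__":
    print(constellation_of((0, 2, 6, 8)))      # offsets of a prime quadruplet
    print(constellation_of((19, 11, 17, 13)))  # an unsorted instance of the same pattern
```

Both calls yield the differences 2, 4, 2. Whether these differences occur as 
gaps between consecutive primes is exactly the extra condition that the 
lemma below has to address. 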

Lemma 6.1. The $k$-tuple conjecture is equivalent to the conjecture that the 
constellation $s_b$ occurs in $\mathcal{G}(p)$ for some $p$ with 
$2p > b_k - b_1$. 

Proof. For the $k$-tuple $B = (b_1, \ldots, b_k)$ we put the $b_i$ in 
ascending order, and then associate this with the constellation 

\[ s_b = (b_2 - b_1), (b_3 - b_2), \ldots, (b_k - b_{k-1}). \]

($\Leftarrow$) If the constellation $s_b$ occurs in $\mathcal{G}(p)$ with 
$2p > b_k - b_1$, then this constellation occurs infinitely often as a 
constellation among primes, and the $k$-tuple conjecture would be true for 
$B$. 

($\Rightarrow$) Conversely, suppose the $k$-tuple conjecture were true for 
$B$. Then the sequence of differences in (6) occurs infinitely often among 
the primes. Let $p$ be a prime large enough so that two conditions hold: 
$2p > b_k - b_1$, and one instance of the $k$-tuple of primes falls within 
$\mathcal{G}(p)$. Then in $\mathcal{G}(p)$ there is a constellation 

\[ G_1 = g_{20} \cdots g_{2n_2} \, g_{30} \cdots g_{3n_3} \cdots g_{k0} \cdots g_{kn_k} \]

such that $\sum_{i=0}^{n_j} g_{ji} = b_j - b_{j-1}$. 



As the recursion of Lemma 3.1 proceeds, we identify a sequence of 
constellations $G_i$ which ends with $G_N$ equal to the constellation $s_b$. 
In the $i$th iteration we pick an image of $G_i$ in which one addition 
$g_{j,i} + g_{j,i+1}$ occurs. By Theorem 3.3 this addition occurs, and due to 
the size of $p$, this is the only addition that occurs in this image of 
$G_i$; so the desired image exists, and we take the resulting constellation 
as $G_{i+1}$. 

For $N = n_2 + n_3 + \cdots + n_k$, $G_N$ is the constellation $s_b$. □ 

In [22], Riesel uses Hardy and Littlewood's work [10] to estimate the 
numbers of occurrences of the constellations 424 and 242. Nicely [17] has 
tested the estimate for 242 computationally. The work in Section 5.2 
supports the asymptotic order of the estimates. In $\mathcal{G}(5)$ we see 
the reason for Nicely's modularity condition $30n + 11$. By applying the 
recursion a few times, we can sort copies of 242 into branches with more 
restrictive modularity conditions. For any other constellation $s$ we can 
observe analogous modularity conditions based on the location of copies of 
$s$ in $\mathcal{G}(p)$. Finally, $\mathcal{G}(5)$ supports the conjecture 
[22] that 424 should occur approximately twice as often as 242. 
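
Both observations about the cycle for 5 can be reproduced by listing the 
residues modulo 30 at which a copy of a constellation starts. The following 
is a minimal sketch, assuming a copy of $s$ starts at a residue $r$ when $r$ 
and each successive partial sum are coprime to the primorial and nothing in 
between is; `start_residues` is an illustrative name. 

```python
from math import gcd


def start_residues(primes, s):
    """Residues r in [1, modulus] at which the gap sequence s starts in the cycle of gaps."""
    modulus = 1
    for q in primes:
        modulus *= q

    def is_unit(x):
        return gcd(x, modulus) == 1

    hits = []
    for r in range(1, modulus + 1):
        if not is_unit(r):
            continue
        x, matches = r, True
        for g in s:
            # The next unit after x must be exactly x + g.
            if not is_unit(x + g) or any(is_unit(x + d) for d in range(1, g)):
                matches = False
                break
            x += g
        if matches:
            hits.append(r)
    return hits


if __name__ == "__main__":
    print(start_residues([2, 3, 5], [2, 4, 2]))
    print(start_residues([2, 3, 5], [4, 2, 4]))
```

In this sketch 242 starts only at the residue 11 modulo 30, matching the 
condition $30n + 11$, while 424 starts at the two residues 7 and 13. 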



If our weaker Conjecture 5.3 is correct, we can provide solutions to some 
problems posed by Erdős and Turán [6]: 

i) Spikes. What are $\limsup g_n/g_{n+1}$ and $\liminf g_n/g_{n+1}$? 

ii) Oscillation. Is there an $n_0$ such that for all $k > 1$, 
$g_{n_0+2k-1} < g_{n_0+2k}$ and $g_{n_0+2k} > g_{n_0+2k+1}$? 

iii) Superlinearity. Can $g_j < g_{j+1} < \ldots < g_{j+k}$ have infinitely 
many solutions for every $k$? 

Spikes. We will construct a sequence $\{s_k\}$ of constellations, such that 
$s_k$ occurs in $\mathcal{G}(p_k)$, $s_k$ is of the form $g_{k,n_k}\,2$, and 
$g_{k,n_k}$ is strictly increasing in $k$. Consider constellations in 
$\mathcal{G}(p_k)$ of the form $g2$. For $s_k$ we take a pair with the 
maximal $g$. In $\mathcal{G}(7)$, for example, this pair is 10,2. By Theorem 
3.3, in $\mathcal{G}(p_{k+1})$ one of the images of this pair $g2$ has the 
gap $g$ added to the preceding gap. This shows that the sequence $g_{k,n_k}$ 
is strictly increasing in $k$. Thus 

\[ \limsup g_n/g_{n+1} \ge \limsup (g_{k,n_k}/2) = \infty. \]

By the symmetry of $\mathcal{G}(p)$, 

\[ \liminf g_n/g_{n+1} \le \liminf (2/g_{k,n_k}) = 0. \]
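
The growth of the largest gap that is immediately followed by a 2 can be 
observed for the first few cycles. This is a minimal sketch, assuming the 
cycle of gaps is read cyclically from the integers coprime to the primorial; 
`gap_cycle` and `max_gap_before_2` are illustrative names, and the 
computation only illustrates the construction, it proves nothing. 

```python
from math import gcd


def gap_cycle(primes):
    """Gaps between consecutive integers coprime to the product of `primes`,
    starting at 1 and ending at 1 + product (one full cycle)."""
    modulus = 1
    for q in primes:
        modulus *= q
    units = [x for x in range(1, modulus + 2) if gcd(x, modulus) == 1]
    return [b - a for a, b in zip(units, units[1:])]


def max_gap_before_2(cycle):
    """Largest gap g such that the pair g, 2 occurs (cyclically) in the cycle."""
    n = len(cycle)
    return max(cycle[i] for i in range(n) if cycle[(i + 1) % n] == 2)


PRIMES = [2, 3, 5, 7, 11, 13]

if __name__ == "__main__":
    for k in range(3, len(PRIMES) + 1):
        print(PRIMES[k - 1], max_gap_before_2(gap_cycle(PRIMES[:k])))
```

For the cycles of 5 and 7 the maximal pairs are 6,2 and 10,2, the latter 
being the example quoted in the text. 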

Oscillation. An affirmative answer to this question asserts that eventu- 
ally the gaps between primes oscillate in size. We provide a counterexample. 
In $\mathcal{G}(7)$, the constellation 24682 occurs. Our conjecture implies that this 
constellation will occur infinitely often as a constellation among prime numbers. 
This constellation is incompatible with the proposed oscillation. 

Superlinearity. This third problem inquires about sequences of consecutive 
primes that exhibit superlinear growth. We provide examples of the requested 
constellations, from which our conjecture implies an affirmative answer to 
this problem. From Remark 3.2 we recall that in the middle of 
$\mathcal{G}(p)$ there occurs a constellation of powers of 2: 
$2^k, \ldots, 42424, \ldots, 2^k$. We can make this constellation arbitrarily 
long by taking $p$ large enough. The right half of this constellation is an 
increasing sequence of $k$ gaps. Assuming our conjecture holds, this 
constellation will occur infinitely often as a constellation among primes. 
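
The run of powers of 2 to the right of the center of the cycle can be listed 
directly. The following is a minimal sketch, assuming the center of the cycle 
sits at half the primorial, so that the central gap 4 runs from two below the 
center to two above it; the names `right_of_center` and `power_of_two_run` 
are illustrative. 

```python
from math import gcd


def right_of_center(primes, count):
    """First `count` gaps to the right of the central gap of the cycle of gaps."""
    modulus = 1
    for q in primes:
        modulus *= q
    half = modulus // 2
    units = []
    x = half + 1
    while len(units) < count + 1:
        if gcd(x, modulus) == 1:
            units.append(x)
        x += 1
    return [b - a for a, b in zip(units, units[1:])]


def power_of_two_run(gaps):
    """Longest prefix of `gaps` of the form 2, 4, 8, ..."""
    run = []
    for i, g in enumerate(gaps):
        if g != 2 ** (i + 1):
            break
        run.append(g)
    return run


if __name__ == "__main__":
    for primes in ([2, 3, 5, 7], [2, 3, 5, 7, 11], [2, 3, 5, 7, 11, 13]):
        print(primes[-1], power_of_two_run(right_of_center(primes, 6)))
```

In this sketch the run is 2, 4, 8 for the cycles of 7 and 11 and lengthens to 
2, 4, 8, 16 for the cycle of 13, consistent with the run ending just below 
the next prime. 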

Our work above also provides insight into the search for dense clusters 
of primes [22]. One line of research looks for constellations among the small 
primes to show up again later, among larger primes. Theorem 3.3 and 
Lemma 4.1 indicate limits on these searches. 



7. Conclusion 

We have studied directly the cycle of gaps produced by each stage of 
Eratosthenes sieve. The work above can be divided into two parts: a 
deterministic recursion for $\mathcal{G}(p)$, followed by statistical 
estimates under an assumption of approximately uniform distributions. 

Based on the recursion for the $p$-sieves, we have conjectured that all 
constellations which occur in $\mathcal{G}(p)$ for some prime $p$, and the 
sum of whose gaps is less than $2p$, tend toward a uniform distribution in 
later stages of the sieve. From this conjecture, we can make estimates of the 
number of occurrences of a constellation between $p$ and $p^2$ for the new 
prime $p$ at each stage of the sieve; all constellations which occur before 
$p^2$ actually occur as constellations between primes. 

We posed two conjectures, either of which implies that every sufficiently 
small constellation in $\mathcal{G}(p)$ occurs infinitely often as a 
constellation among prime numbers. 

Our stronger conjecture asserts that under the recursion of Eratosthenes 
sieve, the images of a constellation are distributed approximately uniformly 
in the fundamental cycle. This conjecture allows us to estimate how many 
copies of a constellation $s$ occur in the interval $[p, p^2]$, and these 
estimates compare favorably to the results from our initial computer 
searches. For single gaps or for constellations consisting only of 2's and 
4's, other estimates are available [10] El [22] [17], and these estimates 
agree with ours at first order. 



This Conjecture 5.2 on uniformity provides many open problems. In our 
estimates, we use the interval $[1, \Pi_k]$ as the sample space for the 
conjectured uniform distribution. Simple observations about the structure of 
$\mathcal{G}(p)$ allow us to adjust the sample space. These refinements are 
on the order of $p$ and so improve our estimates only for very small prime 
numbers. Are there any refinements that will improve our estimates in the 
large? 

While the statistics in these estimates may be refined, with the weaker 
Conjecture 5.3 we have addressed three problems posed by Erdős and Turán 
[6]. 